Data sources

1. Where the Data was found

The data was sourced from the Victoria Health Coronavirus website, which holds data on coronavirus cases across Victoria since 2020. The site makes data sets available for the public to download and analyse, and posts updated information frequently.

Of the data sets available for download, the one chosen for this analysis is the Victorian case data containing ‘all data of PCR and RAT cases by Local Government Area and postcode’. It was chosen primarily because it contains more variables, which gives more flexibility and control over the analysis and results. This supports the objective of the analysis and makes the results easier to interpret.

This data is protected by copyright and licensed under the Creative Commons Attribution 4.0 International licence. When using the data we must therefore comply with the conditions of that licence, such as attributing the source.

2. Observational or Experimental Data

This is observational data: the Victoria Health department specifies that the data was gathered through contact tracing and management of the COVID-19 outbreak. Researchers and health professionals therefore did not control or assign the factors affecting the variables; they recorded cases as they occurred and can at most look for correlations between factors. In this scenario the units observed are the coronavirus cases, that is, the people affected by the coronavirus.

4. Unit and potential Unique Identifier

In this data set, “postcode” is not unique to a single record, but it can act as a potential identifier: combined with the diagnosis date and case count, it may be enough to single out an individual.
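
As a rough illustration of why combining columns raises the identification risk, the sketch below counts how many records share each combination of values and lists the combinations that occur only once. It is written in Python with the standard library only. The records are hypothetical and invented for this example, and the helper name `group_sizes` is an assumption, not part of the original analysis.

```python
from collections import Counter

def group_sizes(rows, key_columns):
    """Count how many rows share each combination of values in key_columns."""
    return Counter(tuple(row[col] for col in key_columns) for row in rows)

def singled_out(rows, key_columns):
    """Return the key combinations that match exactly one row."""
    sizes = group_sizes(rows, key_columns)
    return sorted(key for key, n in sizes.items() if n == 1)

# Hypothetical records, invented purely for illustration.
rows = [
    {"diagnosis_date": "2020-07-01", "postcode": "3000"},
    {"diagnosis_date": "2020-07-01", "postcode": "3000"},
    {"diagnosis_date": "2020-07-01", "postcode": "3021"},
    {"diagnosis_date": "2020-07-02", "postcode": "3021"},
]

# Postcode alone: every postcode is shared by two records.
unique_by_postcode = singled_out(rows, ["postcode"])

# Postcode plus diagnosis date: two combinations match a single record.
unique_by_date_and_postcode = singled_out(rows, ["diagnosis_date", "postcode"])
```

In this toy example no record is singled out by postcode alone, but two are once the diagnosis date is added, which is the kind of combination the paragraph above warns about.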

5. Storing and Analysing the Data

The data set in its current form can be stored and used to analyse the research question; however, the only variables essential to this analysis are the date of diagnosis and the number of cases on each day. Therefore, to keep the report concise and accurate, we remove all columns except “diagnosis_date” and create a new column, “total_cases”, giving the total number of cases on each day within the period of interest.
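
The reduction described above could look roughly like the following Python sketch, which uses only the standard library. It assumes one row per case and a “diagnosis_date” column written as YYYY-MM-DD; both are assumptions about the file layout, and the sample rows are invented. The period of interest is taken from the time-series caption (30 June to 13 September 2020).

```python
import csv
import io
from collections import defaultdict
from datetime import date

def daily_totals(rows, start, end):
    """Keep only diagnosis_date and count cases per day within [start, end]."""
    totals = defaultdict(int)
    for row in rows:
        day = date.fromisoformat(row["diagnosis_date"])
        if start <= day <= end:
            totals[day] += 1  # assumption: each row is one case
    return [
        {"diagnosis_date": day.isoformat(), "total_cases": count}
        for day, count in sorted(totals.items())
    ]

# Hypothetical extract standing in for the downloaded CSV file.
sample_csv = io.StringIO(
    "diagnosis_date,postcode\n"
    "2020-06-29,3000\n"
    "2020-06-30,3000\n"
    "2020-06-30,3021\n"
    "2020-07-01,3021\n"
    "2020-09-14,3000\n"
)

reduced = daily_totals(
    csv.DictReader(sample_csv),
    start=date(2020, 6, 30),
    end=date(2020, 9, 13),
)
```

If the real file already stores a case count per row rather than one row per case, the increment would need to add that count instead of 1.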

🔍 Analysis

6. Daily Cases over Time of Interest

Time series of Covid cases from 30th June to the 13th of September 2020

7. 7-Day Growth Rate

Growth Rate of Covid cases within the lockdown period of 30th June to 29th July 2020

Since we are concerned with the effects of local lockdowns, the growth rate above is restricted to the lockdown within the investigation period. According to Davey (2020), the lockdown at that time was imposed from 30 June to 29 July 2020. The growth-rate figure above shows that, although there was an initial spike in the growth rate of cases, it subsided towards the end of the lockdown. This suggests that the lockdown was relatively effective in this period.
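
The report does not state the formula behind the 7-day growth rate, so the Python sketch below shows one common definition as an assumption: the relative change in daily cases compared with the same day one week earlier, (cases_t - cases_{t-7}) / cases_{t-7}. The function name and the daily counts are hypothetical and are not the Victorian figures.

```python
def seven_day_growth_rate(daily_cases, window=7):
    """Relative change versus the value `window` days earlier.

    Returns None where no comparison is possible (the first `window`
    days, or when the earlier count is zero).
    """
    rates = []
    for i, cases in enumerate(daily_cases):
        if i < window or daily_cases[i - window] == 0:
            rates.append(None)
        else:
            previous = daily_cases[i - window]
            rates.append((cases - previous) / previous)
    return rates

# Hypothetical daily counts, invented for illustration.
daily_cases = [10, 12, 15, 20, 18, 25, 30, 20, 18, 15]
rates = seven_day_growth_rate(daily_cases)
# Day 8 compares 20 with 10 (growth 1.0), day 9 compares 18 with 12 (0.5),
# day 10 compares 15 with 15 (0.0).
```

A falling sequence of rates, as in this toy series, is the pattern the paragraph above describes towards the end of the lockdown.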

📉 Data curation

The criteria followed in this section are those laid out by Broman and Woo (2018).

  1. Be consistent in how the data is organized. In this Covid-19 data set, the variables used consistently are “diagnosed_date” and “Total_cases”.

  2. Choose good, meaningful names for variables and files. In this data set, “diagnosed_date” represents the date on which a patient was diagnosed as positive for Covid-19, while “Total_cases” represents the total number of cases diagnosed on each date.

  3. When writing dates it is best to use the ISO 8601 format YYYY-MM-DD, as Broman and Woo (2018) recommend.
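
As a minimal sketch of the date guidance in Broman and Woo (2018), which favours the YYYY-MM-DD form, the Python snippet below rewrites a date from another layout into that form. The input layout day/month/year is an assumption chosen for illustration, and `to_iso_date` is a hypothetical helper, not a function used in the report.

```python
from datetime import datetime

def to_iso_date(text, input_format="%d/%m/%Y"):
    """Parse `text` using `input_format` and return it as YYYY-MM-DD."""
    return datetime.strptime(text, input_format).date().isoformat()

# Start and end of the investigation period, written day/month/year.
start_iso = to_iso_date("30/06/2020")
end_iso = to_iso_date("13/09/2020")
```

Dates stored this way sort correctly as plain text, which keeps a column such as the diagnosis date unambiguous across tools.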

Resources

Broman, Karl W., and Kara H. Woo. 2018. “Data Organization in Spreadsheets.” The American Statistician 72 (1): 2–10. https://doi.org/10.1080/00031305.2017.1375989.
Davey, Melissa. 2020. “Melbourne Suburbs Lockdown Announced as Victoria Battles Coronavirus Outbreaks.” The Guardian. https://www.theguardian.com/australia-news/2020/jun/30/melbourne-hotspot-lockdowns-announced-as-victoria-battles-coronavirus-outbreaks.
Spinu, Vitalie, Garrett Grolemund, and Hadley Wickham. 2023. Lubridate: Make Dealing with Dates a Little Easier. https://CRAN.R-project.org/package=lubridate.
Wickham, Hadley. 2023. Tidyverse: Easily Install and Load the Tidyverse. https://CRAN.R-project.org/package=tidyverse.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2023. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.
Xie, Yihui. 2023. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.